{ "metadata": { "kernelspec": { "name": "python", "display_name": "Python (Pyodide)", "language": "python" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.7" } }, "nbformat_minor": 4, "nbformat": 4, "cells": [ { "cell_type": "markdown", "source": "## Textual data combination and pre-processing\n\n### Combine the lyrics of the year\n\nThe goal of our analysis is to capture broad trends and general shifts in lyrical content, so it is more useful to combine the previously scraped lyrics into a single text per year than to handle almost 100 separate texts for each year. The pros and cons of this choice are listed below. After finishing the general topic modelling, I'll evaluate the results qualitatively and decide how to deal with the potential problems of combining the lyrics by year.\n\n#### Pros:\n